Analysis of a Vector Space Model, Latent Semantic Indexing and Formal Concept Analysis for Information Retrieval

نویسندگان

  • Ch. Aswani Kumar
  • M. Radvansky
  • J. Annapurna
چکیده

Latent Semantic Indexing (LSI), a variant of classical Vector Space Model (VSM), is an Information Retrieval (IR) model that attempts to capture the latent semantic relationship between the data items. Mathematical lattices, under the framework of Formal Concept Analysis (FCA), represent conceptual hierarchies in data and retrieve the information. However, both LSI and FCA use the data represented in the form of matrices. The objective of this paper is to systematically analyze VSM, LSI and FCA for the task of IR using standard and real life datasets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Information Retrieval as Semantics Transformation

In this paper we present Information Retrieval as a semantics transformation problem. We describe a general theory for deriving concepts, and discuss 2 special cases: the vector model and the set model. The vector model leads to concepts derived by latent semantic indexing using the singular value decomposition. The set model leads to Formal Concept Analysis. We discuss the relation between the...

متن کامل

Comparison of Information Retrieval Techniques: Latent Semantic Indexing and Concept Indexing

The task of information retrieval is to extract relevant documents for a certain query from the collection of documents. As large sets of documents are now increasingly common, there is a growing need for fast and efficient information retrieval algorithms. The algorithms we are dealing with are embedded in the vector space model. In this paper we compare two information retrieval techniques: l...

متن کامل

A measure theoretic approach to information retrieval

The vector space model of information retrieval is one of the classical and widely applied retrieval models. Paradoxically, it has been characterised by a discrepancy between its formal framework and implementable form. The underlying concepts of the vector space model are mathematical terms: linear space, vector, and inner product. However, in the vector space model, the mathematical meaning o...

متن کامل

Latent Semantic Indexing Based on Factor Analysis

The main purpose of this paper is to propose a novel latent semantic indexing (LSI), statistical approach to simultaneously mapping documents and terms into a latent semantic space. This approach can index documents more effectively than the vector space model (VSM). Latent semantic indexing (LSI), which is based on singular value decomposition (SVD), and probabilistic latent semantic indexing ...

متن کامل

Document Clustering: Before and After the Singular Value Decomposition

Document Clustering is an issue of measuring similarity between documents and grouping similar documents together. Information Retrieval (IR) is an issue of comparing query with a collection of documents to locate a set of documents relevant to a particular query. In the vector space IR model, a query is treated as a document which consists of a few terms. Therefore, in both clustering and retr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012